
60 research outputs found

A comprehensive evaluation of ChatGPT's zero-shot Text-to-SQL capability

Author: Hu Xuming
Liu Aiwei
Wen Lijie
Yu Philip S.
Publication date: 11/03/2023

This paper presents the first comprehensive analysis of ChatGPT's Text-to-SQL ability. Given the recent emergence of the large-scale conversational language model ChatGPT and its impressive capabilities in both conversation and code generation, we sought to evaluate its Text-to-SQL performance. We conducted experiments on 12 benchmark datasets with different languages, settings, or scenarios, and the results demonstrate that ChatGPT has strong Text-to-SQL abilities. Although there is still a gap from the current state-of-the-art (SOTA) model performance, considering that the experiment was conducted in a zero-shot scenario, ChatGPT's performance is still impressive. Notably, in the ADVETA (RPL) scenario, the zero-shot ChatGPT even outperforms the SOTA model that requires fine-tuning on the Spider dataset by 4.1%, demonstrating its potential for use in practical applications. To support further research in related fields, we have made the data generated by ChatGPT publicly available at https://github.com/THU-BPM/chatgpt-sql.

Comment: 6 pages, 1 figure

arXiv.org e-Print Archive

A Semantic Invariant Robust Watermark for Large Language Models

Author: Hu Xuming
Liu Aiwei
Meng Shiao
Pan Leyi
Wen Lijie
Publication date: 10/10/2023

Watermark algorithms for large language models (LLMs) have achieved extremely high accuracy in detecting text generated by LLMs. Such algorithms typically involve adding extra watermark logits to the LLM's logits at each generation step. However, prior algorithms face a trade-off between attack robustness and security robustness. This is because the watermark logits for a token are determined by a certain number of preceding tokens; a small number leads to low security robustness, while a large number results in insufficient attack robustness. In this work, we propose a semantic invariant watermarking method for LLMs that provides both attack robustness and security robustness. The watermark logits in our work are determined by the semantics of all preceding tokens. Specifically, we utilize another embedding LLM to generate semantic embeddings for all preceding tokens, and then these semantic embeddings are transformed into the watermark logits through our trained watermark model. Subsequent analyses and experiments demonstrated the attack robustness of our method in semantically invariant settings: synonym substitution and text paraphrasing settings. Finally, we also show that our watermark possesses adequate security robustness. Our code and data are available at https://github.com/THU-BPM/Robust_Watermark.

Comment: 16 pages, 9 figures, 2 tables
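The abstract above describes prior schemes in which the watermark logits for a token depend on a fixed number of preceding tokens. A minimal sketch of that general idea follows; it is not the paper's semantic method. The hashing scheme, the `key`, the `window` size, the "favored half of the vocabulary" rule, and the `delta` value are all assumptions made for illustration.

```python
import hashlib
import random


def watermark_logits(prefix_tokens, vocab_size, window=2, delta=2.0, key="demo-key"):
    """Return a bias vector that depends only on the last `window` tokens.

    Hypothetical scheme: hash the recent context with a secret key, use the
    hash to seed a PRNG, and give a fixed bonus `delta` to half the vocabulary.
    """
    context = ",".join(str(t) for t in prefix_tokens[-window:])
    digest = hashlib.sha256(f"{key}|{context}".encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    favored = set(rng.sample(range(vocab_size), vocab_size // 2))
    return [delta if i in favored else 0.0 for i in range(vocab_size)]


def add_watermark(logits, prefix_tokens, **kwargs):
    """Add the watermark bias to the model's logits for one generation step."""
    bias = watermark_logits(prefix_tokens, len(logits), **kwargs)
    return [l + b for l, b in zip(logits, bias)]
```

In this sketch, a small `window` makes the bias easy to reproduce after edits elsewhere in the text but ties it to very little context, while a large `window` means a single changed token alters the bias for many later positions. That is the trade-off the abstract attributes to prefix-based schemes.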

arXiv.org e-Print Archive

Exploring the Compositional Generalization in Context Dependent Text-to-SQL Parsing

Author: Hu Xuming
Li Shuang
Liu Aiwei
Liu Wei
Ma Fukun
Wen Lijie
Yang Yawen
Publication date: 29/05/2023

In the context-dependent Text-to-SQL task, the generated SQL statements are refined iteratively based on the user input utterance from each interaction. The input text from each interaction can be viewed as component modifications to the previous SQL statements, which could be further extracted as the modification patterns. Since these modification patterns could also be combined with other SQL statements, the models are supposed to have the compositional generalization to these novel combinations. This work is the first exploration of compositional generalization in context-dependent Text-to-SQL scenarios. To facilitate related studies, we constructed two challenging benchmarks named CoSQL-CG and SParC-CG by recombining the modification patterns and existing SQL statements. The following experiments show that all current models struggle on our proposed benchmarks. Furthermore, we found that better aligning the previous SQL statements with the input utterance could give models better compositional generalization ability. Based on these observations, we propose a method named p-align to improve the compositional generalization of Text-to-SQL models. Further experiments validate the effectiveness of our method. Source code and data are available.

Comment: Accepted to ACL 2023 (Findings), Long Paper, 11 pages

arXiv.org e-Print Archive

Prompt Me Up: Unleashing the Power of Alignments for Multimodal Entity and Relation Extraction

Author: Chen Junzhe
Hu Xuming
Liu Aiwei
Meng Shiao
Wen Lijie
Yu Philip S.
Publication date: 25/10/2023

How can we better extract entities and relations from text? Using multimodal extraction with images and text obtains more signals for entities and relations, and aligns them through graphs or hierarchical fusion, aiding in extraction. Despite attempts at various fusions, previous works have overlooked many unlabeled image-caption pairs, such as NewsCLIPing. This paper proposes innovative pre-training objectives for entity-object and relation-image alignment, extracting objects from images and aligning them with entity and relation prompts for soft pseudo-labels. These labels are used as self-supervised signals for pre-training, enhancing the ability to extract entities and relations. Experiments on three datasets show an average 3.41% F1 improvement over prior SOTA. Additionally, our method is orthogonal to previous multimodal fusions, and using it on prior SOTA fusions further improves 5.47% F1.

Comment: Accepted to ACM Multimedia 202

arXiv.org e-Print Archive

RAPL: A Relation-Aware Prototype Learning Approach for Few-Shot Document-Level Relation Extraction

Author: Hu Xuming
Li Shu'ang
Liu Aiwei
Ma Fukun
Meng Shiao
Wen Lijie
Yang Yawen
Publication date: 24/10/2023

How to identify semantic relations among entities in a document when only a few labeled documents are available? Few-shot document-level relation extraction (FSDLRE) is crucial for addressing the pervasive data scarcity problem in real-world scenarios. Metric-based meta-learning is an effective framework widely adopted for FSDLRE, which constructs class prototypes for classification. However, existing works often struggle to obtain class prototypes with accurate relational semantics: 1) To build a prototype for a target relation type, they aggregate the representations of all entity pairs holding that relation, while these entity pairs may also hold other relations, thus disturbing the prototype. 2) They use a set of generic NOTA (none-of-the-above) prototypes across all tasks, neglecting that the NOTA semantics differs in tasks with different target relation types. In this paper, we propose a relation-aware prototype learning method for FSDLRE to strengthen the relational semantics of prototype representations. By judiciously leveraging the relation descriptions and realistic NOTA instances as guidance, our method effectively refines the relation prototypes and generates task-specific NOTA prototypes. Extensive experiments demonstrate that our method outperforms state-of-the-art approaches by an average of 2.61% $F_1$ across various settings of two FSDLRE benchmarks.

Comment: Accepted to EMNLP 202
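The abstract above builds on metric-based meta-learning, where each class is represented by a prototype and a query is assigned to the nearest one. A minimal sketch of that baseline follows, assuming mean-pooled prototypes and Euclidean distance; it does not implement the paper's relation-aware refinement or its task-specific NOTA generation. The label names and two-dimensional embeddings are hypothetical.

```python
import math


def mean_vector(vectors):
    """Average a list of equal-length embedding vectors, component by component."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def build_prototypes(support):
    """Map each label to the mean of its support embeddings."""
    return {label: mean_vector(vecs) for label, vecs in support.items()}


def classify(query, prototypes):
    """Return the label whose prototype is closest to the query embedding."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


# Hypothetical support set: one target relation plus a NOTA class.
support = {
    "founded_by": [[1.0, 0.0], [0.8, 0.2]],
    "NOTA": [[0.0, 1.0], [0.2, 0.8]],
}
prototypes = build_prototypes(support)
print(classify([0.7, 0.3], prototypes))
```

Because each prototype here is a plain average, any support pair that also holds another relation pulls the prototype toward that relation, which is the first weakness the abstract lists.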

arXiv.org e-Print Archive

A Private Watermark for Large Language Models

Author: Hu Xuming
King Irwin
Li Shu'ang
Liu Aiwei
Pan Leyi
Wen Lijie
Yu Philip S.
Publication date: 02/08/2023

Recently, text watermarking algorithms for large language models (LLMs) have been mitigating the potential harms of text generated by the LLMs, including fake news and copyright issues. However, the watermark detection of current text algorithms requires the key from the generation process, making them susceptible to breaches and counterfeiting. In this work, we propose the first private watermarking algorithm, which extends the current text watermarking algorithms by using two different neural networks respectively for watermark generation and detection, rather than using the same key at both stages. Meanwhile, part of the parameters of the watermark generation and detection networks are shared, which makes the detection network achieve a high accuracy very efficiently. Experiments show that our algorithm ensures high detection accuracy with minimal impact on generation and detection speed, due to the small parameter size of both networks. Additionally, our subsequent analysis demonstrates the difficulty of reverting the watermark generation rules from the detection network.

Comment: 13 pages, 3 figures, 3 tables

arXiv.org e-Print Archive

The scale effects of organometal halide perovskites

Author: Liu Zhe
Tang Aiwei
Zhang Yibo
Zhao Zhenze
Publication venue: MDPI AG
Publication date: 01/11/2023

Organometal halide perovskites have achieved great success in solution-processed photovoltaics. The explorations quickly expanded into other optoelectronic applications, including light-emitting diodes, lasers, and photodetectors. An in-depth analysis of the special scale effects is essential to understand the working mechanisms of devices and optimize the materials towards an enhanced performance. Generally speaking, organometal halide perovskites can be classified in two ways. By controlling the morphological dimensionality, 2D perovskite nanoplatelets, 1D perovskite nanowires, and 0D perovskite quantum dots have been studied. Using appropriate organic and inorganic components, low-dimensional organic–inorganic metal halide hybrids with 2D, quasi-2D, 1D, and 0D structures at the molecular level have been developed and studied. This provides opportunities to investigate the scale-dependent properties. Here, we present the progress on the characteristics of scale effects in organometal halide perovskites in these two classifications, with a focus on carrier diffusion, excitonic features, and defect properties

Central Archive at the University of Reading

Directory of Open Access Journals

Exploring the clinical efficacy and mechanism of high-position colon dialysis combined with Traditional Chinese Medicine retention enema in real-world patients with stage 3–5 chronic kidney disease (non-dialysis) based on the theory of the Gut–Kidney axis

Author: Aiwei Wen
Dongxian Xu
Fanyun Shao
Leixiao Zhang
Lin Yang
Si Chen
Tao Shen
Wei Wu
Yanli Deng
Yuhao Hou
Zhen Liu
Publication venue: Frontiers Media S.A.
Publication date: 01/01/2024

Background: With societal and economic development, the annual incidence of chronic kidney disease (CKD) is increasing. Current treatments for CKD are limited, and once patients progress to the uraemic stage, it places a significant economic burden on families and society. Based on the “gut–kidney axis” theory and real-world research, this study aims to evaluate the clinical efficacy, safety, and potential mechanism of high-position colon dialysis combined with traditional Chinese medicine (TCM) retention enema in treating stage 3–5 chronic kidney disease (non-dialysis). Additionally, it seeks to identify new therapeutic targets and approaches for CKD treatment.

Methods: The TCM decoction was analyzed using Ultra-Performance Liquid Chromatography-Quadrupole-Orbitrap-High Resolution Mass Spectrometry (UPLC-Q-Orbitrap-HRMS). Participants meeting the inclusion criteria were divided into a control group (n = 153) and a treatment group (n = 159) based on their preferences and physicians’ recommendations. Both groups adhered to a high-quality low-protein, low-salt, low-phosphorus, and low-fat diet supplemented with essential amino acids, and were monitored for blood pressure, blood glucose, and blood lipids. The treatment group received high-position colon dialysis combined with TCM retention enemas (administered at least 12 times every other day).

Results: Thirteen compounds were identified from the herbs by UPLC-Q-Orbitrap-HRMS. The CKD3–5 treatment group exhibited improvements in blood biochemistry and other laboratory indices, with significant enhancements in renal function-related indices for CKD4 and CKD5 stages (p < 0.05). Following treatment, indoxyl sulfate (IS), endotoxin, and D-lactic acid levels decreased to a certain extent in both groups, with a statistically significant difference observed within the treatment group (p < 0.05). The treatment group displayed a significant reduction in aerobic bacterial colonies, an increase in anaerobic bacterial colonies, a decrease in Escherichia coli colonies, and an increase in Bifidobacterium and Lactobacillus colonies (p < 0.05). No significant changes in colony numbers were observed in the control group.

Conclusion: High-position colon dialysis combined with TCM retention enema may serve as an adjuvant treatment for CKD4-5 (non-dialysis), and its mechanism may be related to the reduction of uraemic toxins, improvement of intestinal mucosal barrier function, and regulation of intestinal microecology.

Clinical Trial Registration: https://www.chictr.org.cn/, identifier ChiCTR2200062852

Directory of Open Access Journals

Fluorescence Lifetime Imaging ophthalmoscope and Supplement of agonistic TrkB antibody in a glaucoma model

Author: Liu Aiwei

A General One-Pot Approach to Synthesize Binary and Ternary Metal Sulfide Nanocrystals

Author: Aiwei Tang
Chao Xiong
Mingrui Liu
Xifang Zhu
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2019

A general one-pot approach is developed to synthesize a series of binary metal sulfide nanocrystals (NCs) including PbS, Cu2S, ZnS, CdS, Ag2S, and ternary CuInS2 and CdS:Cu(I) NCs. This synthetic approach involves thermal decomposition of the mixture of inorganic metal salts and n-dodecanethiol (DDT) without pre-synthesis of any organometallic precursors. In this method, a layered metal-thiolate compound is formed at the beginning of the reaction, and this intermediate compound then decomposes into small particles, which grow further as the reaction time increases. The as-obtained CdS NCs exhibit a broad but weak surface-state emission, and the Cu(I) doping leads to a red-shift of the emission band due to the Cu(I)-related emission. It is expected that this one-pot approach can be extended to prepare multinary metal sulfide NCs.

Directory of Open Access Journals